Sure Independence Screening for Ultra-High Dimensional Feature Space
Authors
Abstract
High dimensionality is a growing feature in many areas of contemporary statistics. Variable selection is fundamental to high-dimensional statistical modeling. For problems of large or huge scale p_n, computational cost and estimation accuracy are always two top concerns. In a seminal paper, Candes and Tao (2007) propose a minimum ℓ1 estimator, the Dantzig selector, and show that it mimics the ideal risk within a logarithmic factor log p_n. Their innovative procedure and remarkable result are challenged when the dimensionality is ultra high: the factor log p_n can be large and their uniform uncertainty condition can fail. Motivated by these concerns, in this paper we introduce the concept of sure screening and propose a fast and straightforward method via iteratively thresholded ridge regression, called Sure Independence Screening (SIS), to reduce high dimensionality to a relatively large scale d_n, say below sample size. An appealing special case of SIS is componentwise regression. In a fairly general asymptotic framework, SIS is shown to possess the sure screening property even for exponentially growing dimensionality. With ultra-high dimensionality reduced accurately to below sample size, variable selection becomes much easier and can be accomplished by refined lower-dimensional methods that have oracle properties. Depending on the scale of d_n, one can use, for example, the Dantzig selector or Lasso, the SCAD-penalized least squares of Fan and Li (2001), or the adaptive Lasso of Zou (2006).

Short title: Sure Independence Screening
AMS 2000 subject classifications: Primary 62J99; secondary 62F12
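The screening step described above, in its componentwise-regression special case, amounts to ranking features by the magnitude of their marginal correlation with the response and keeping the top d_n of them. A minimal sketch of that idea in NumPy follows; the function name and interface are illustrative, not taken from the paper:

```python
import numpy as np

def sis_screen(X, y, d):
    """Illustrative sketch of SIS via componentwise regression:
    rank features by absolute marginal correlation with y, keep top d."""
    # Standardize columns so the componentwise regression coefficient
    # of each feature reduces to its sample correlation with y.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    # Magnitudes of the componentwise (marginal) regression coefficients.
    omega = np.abs(Xs.T @ ys) / len(y)
    # Indices of the d features with the largest marginal correlation.
    return np.argsort(omega)[::-1][:d]
```

After this screening step, a refined method such as the Lasso or SCAD-penalized least squares would be run on the d retained features, as the abstract suggests.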
Related papers
Sure independence screening for ultrahigh dimensional feature space
High dimensionality is a growing feature in many areas of contemporary statistics. Variable selection is fundamental to high-dimensional statistical modeling. For problems of large or huge scale pn, computational cost and estimation accuracy are always two top concerns. In a seminal paper, Candes and Tao (2007) propose a minimum l1 estimator, the Dantzig selector, and show that it mimics the id...
arXiv:math/0612857v2 [math.ST] 27 Aug 2008 — Sure Independence Screening for Ultra-High Dimensional Feature Space
August 27, 2008 Abstract Variable selection plays an important role in high dimensional statistical modeling which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using L1 regularizati...
Sure Independence Screening
Big data is ubiquitous in various fields of sciences, engineering, medicine, social sciences, and humanities. It is often accompanied by a large number of variables and features. While adding much greater flexibility to modeling with enriched feature space, ultra-high dimensional data analysis poses fundamental challenges to scalable learning and inference with good statistical efficiency. Sure...
Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space"
June 30, 2008 Abstract Variable selection plays an important role in high dimensional statistical modeling which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using L1 regularization...
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models
A variable screening procedure via correlation learning was proposed in Fan and Lv (2008) to reduce dimensionality in sparse ultra-high dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Our nonparametric independence screening is called NIS...
Publication date: 2008